RDG for DPF with OVN-Kubernetes and HBN Services

DPF Operator Installation

Cert-manager is a powerful and extensible X.509 certificate controller for Kubernetes workloads. It obtains certificates from a variety of Issuers, both popular public Issuers and private Issuers, ensures that certificates are valid and up to date, and attempts to renew them at a configured time before expiry.

In this deployment, cert-manager is a prerequisite that provides certificates for the webhooks used by DPF and its dependencies.
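
For example, later in the deployment, once DPF and its dependencies are installed, the Certificate resources that cert-manager issues for these webhooks can be listed. This is only an illustrative check and is not part of the installation steps:

    $ kubectl get certificates --all-namespaces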

  1. Install cert-manager using Helm.

    1. The following values will be used for the Helm chart installation:

      manifests/02-dpf-operator-installation/helm-values/cert-manager.yml

      startupapicheck:
        enabled: false
      crds:
        enabled: true
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
            - matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: Exists
      tolerations:
      - operator: Exists
        effect: NoSchedule
        key: node-role.kubernetes.io/control-plane
      - operator: Exists
        effect: NoSchedule
        key: node-role.kubernetes.io/master
      cainjector:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: node-role.kubernetes.io/master
                  operator: Exists
              - matchExpressions:
                - key: node-role.kubernetes.io/control-plane
                  operator: Exists
        tolerations:
        - operator: Exists
          effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
        - operator: Exists
          effect: NoSchedule
          key: node-role.kubernetes.io/master
      webhook:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: node-role.kubernetes.io/master
                  operator: Exists
              - matchExpressions:
                - key: node-role.kubernetes.io/control-plane
                  operator: Exists
        tolerations:
        - operator: Exists
          effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
        - operator: Exists
          effect: NoSchedule
          key: node-role.kubernetes.io/master

    2. Run the following commands:

      Jump Node Console

      $ helm repo add jetstack https://charts.jetstack.io --force-update
      $ helm upgrade --install --create-namespace --namespace cert-manager cert-manager jetstack/cert-manager --version v1.16.1 -f ./manifests/02-dpf-operator-installation/helm-values/cert-manager.yml

      Release "cert-manager" does not exist. Installing it now.
      NAME: cert-manager
      LAST DEPLOYED: Tue May 20 12:59:30 2025
      NAMESPACE: cert-manager
      STATUS: deployed
      REVISION: 1
      TEST SUITE: None
      NOTES:
      cert-manager v1.16.1 has been deployed successfully!

  2. Verify that all pods in the cert-manager namespace are in the Ready state:

    Jump Node Console

    $ kubectl wait --for=condition=ready --namespace cert-manager pods --all
    pod/cert-manager-6ffdf6c5f8-7k7zz condition met
    pod/cert-manager-cainjector-66b8577665-fgcqg condition met
    pod/cert-manager-webhook-5cb94cb7b6-9rk9m condition met
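
    Because the Helm values above pin the cert-manager components to the control-plane nodes, pod placement can optionally be confirmed as well. This is an illustrative check; pod and node names will differ per environment:

    $ kubectl get pods --namespace cert-manager -o wide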

  1. Download the local-path-provisioner Helm chart to your current working directory and create a namespace for it:

    Jump Node Console

    $ curl https://codeload.github.com/rancher/local-path-provisioner/tar.gz/v0.0.30 | tar -xz --strip=3 local-path-provisioner-0.0.30/deploy/chart/local-path-provisioner/
    $ kubectl create ns local-path-provisioner
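
    Optionally, confirm that the chart was extracted into the current working directory. This is an illustrative check; the chart metadata should report version 0.0.30:

    $ helm show chart ./local-path-provisioner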

  2. The following values will be used for the installation:

    manifests/02-dpf-operator-installation/helm-values/local-path-provisioner.yml

    tolerations:
    - operator: Exists
      effect: NoSchedule
      key: node-role.kubernetes.io/control-plane
    - operator: Exists
      effect: NoSchedule
      key: node-role.kubernetes.io/master

    Run the following command:

    Jump Node Console

    $ helm install -n local-path-provisioner local-path-provisioner ./local-path-provisioner --version 0.0.30 -f ./manifests/02-dpf-operator-installation/helm-values/local-path-provisioner.yml

    NAME: local-path-provisioner
    LAST DEPLOYED: Tue May 20 13:01:40 2025
    NAMESPACE: local-path-provisioner
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    NOTES:
    ...

  3. Ensure that the pod in the local-path-provisioner namespace is in the Ready state:

    Jump Node Console

    $ kubectl wait --for=condition=ready --namespace local-path-provisioner pods --all
    pod/local-path-provisioner-75f649c47c-fbccd condition met
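
    The DPF Operator Helm values used later in this section reference a local-path storage class, so it can be useful to confirm that the provisioner created it. This is an illustrative check, assuming the chart's default storage class name:

    $ kubectl get storageclass local-path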

  • Create the namespace for the DPF Operator:

    Jump Node Console

    $ kubectl create ns dpf-operator-system

  • The following YAML file defines the storage (for the BFB image) required by the DPF Operator:

    manifests/02-dpf-operator-installation/nfs-storage-for-bfb-dpf-ga.yaml

    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: bfb-pv
    spec:
      capacity:
        storage: 10Gi
      volumeMode: Filesystem
      accessModes:
      - ReadWriteMany
      nfs:
        path: /mnt/dpf_share/bfb
        server: $NFS_SERVER_IP
      persistentVolumeReclaimPolicy: Delete
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: bfb-pvc
      namespace: dpf-operator-system
    spec:
      accessModes:
      - ReadWriteMany
      resources:
        requests:
          storage: 10Gi
      volumeMode: Filesystem
      storageClassName: ""
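
    The $NFS_SERVER_IP variable in the PersistentVolume above must be exported in the shell before the envsubst step that follows. A minimal sketch with a placeholder address (use the actual NFS server IP for your environment):

    $ export NFS_SERVER_IP=10.0.110.130   # placeholder value - replace with the real NFS server address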

  • Run the following command to substitute the environment variables using envsubst and apply the YAML files:

    Jump Node Console

    $ cat manifests/02-dpf-operator-installation/*.yaml | envsubst | kubectl apply -f -
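
    Once applied, the claim should bind to the static NFS volume defined above. A quick illustrative check (the PVC STATUS is expected to be Bound):

    $ kubectl get pvc --namespace dpf-operator-system bfb-pvc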

  1. The DPF Operator Helm values are detailed in the following YAML file:

    manifests/02-dpf-operator-installation/helm-values/dpf-operator.yml

    kamaji-etcd:
      persistentVolumeClaim:
        storageClassName: local-path
    node-feature-discovery:
      worker:
        extraEnvs:
        - name: "KUBERNETES_SERVICE_HOST"
          value: "$TARGETCLUSTER_API_SERVER_HOST"
        - name: "KUBERNETES_SERVICE_PORT"
          value: "$TARGETCLUSTER_API_SERVER_PORT"

    Run the following commands to substitute the environment variables and install the DPF Operator:

    Jump Node Console

    $ helm repo add --force-update dpf-repository ${REGISTRY}
    $ helm repo update
    $ envsubst < ./manifests/02-dpf-operator-installation/helm-values/dpf-operator.yml | helm upgrade --install -n dpf-operator-system dpf-operator dpf-repository/dpf-operator --version=$TAG --values -

    Release "dpf-operator" does not exist. Installing it now.
    NAME: dpf-operator
    LAST DEPLOYED: Tue May 20 13:18:58 2025
    NAMESPACE: dpf-operator-system
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
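
    The commands above assume that the REGISTRY, TAG, TARGETCLUSTER_API_SERVER_HOST, and TARGETCLUSTER_API_SERVER_PORT environment variables are already exported in the shell. A minimal sketch with placeholder values (the actual values depend on your environment and the DPF release in use):

    $ export REGISTRY=https://example.com/dpf-helm-charts   # placeholder Helm repository URL
    $ export TAG=v25.x.y                                    # placeholder DPF Operator chart version
    $ export TARGETCLUSTER_API_SERVER_HOST=10.0.110.10      # placeholder target cluster API server address
    $ export TARGETCLUSTER_API_SERVER_PORT=6443             # placeholder API server port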

  2. Verify the DPF Operator installation by ensuring the deployment is available and all the pods are ready:

    Note

    The following verification commands may need to be run multiple times to ensure the conditions are met.

    Jump Node Console

    $ kubectl rollout status deployment --namespace dpf-operator-system dpf-operator-controller-manager
    deployment "dpf-operator-controller-manager" successfully rolled out

    $ kubectl wait --for=condition=ready --namespace dpf-operator-system pods --all
    pod/dpf-operator-argocd-application-controller-0 condition met
    pod/dpf-operator-argocd-redis-5bc74d76fc-dclfd condition met
    pod/dpf-operator-argocd-repo-server-86c9454fc9-5wwkw condition met
    pod/dpf-operator-argocd-server-554d9f446-sbz8b condition met
    pod/dpf-operator-controller-manager-67599cdcb7-mzsc8 condition met
    pod/dpf-operator-kamaji-6dcf4ccdfd-hdzwb condition met
    pod/dpf-operator-kamaji-etcd-0 condition met
    pod/dpf-operator-kamaji-etcd-1 condition met
    pod/dpf-operator-kamaji-etcd-2 condition met
    pod/dpf-operator-maintenance-operator-666b88bfcd-hx8h5 condition met
    pod/dpf-operator-node-feature-discovery-gc-656b95dc48-z9tld condition met
    pod/dpf-operator-node-feature-discovery-master-76d5695c7c-d6jlj condition met
